Revving up <sup>13</sup>C NMR shielding predictions across chemical space: benchmarks for atoms-in-molecules kernel machine learning with new data for 134 kilo molecules

Abstract

The requirement for accelerated and quantitatively accurate screening of nuclear magnetic resonance spectra across the chemical compound space of small molecules is two-fold: (1) a robust ‘local’ machine learning (ML) strategy capturing the effect of the neighborhood on an atom’s ‘near-sighted’ property, chemical shielding; (2) an accurate reference dataset generated with a state-of-the-art first-principles method for training. Herein we report the QM9-NMR dataset comprising isotropic shielding of over 0.8 million C atoms in 134k molecules of the QM9 dataset in gas and five common solvent phases. Using these data for training, we present benchmark results for the prediction transferability of kernel-ridge regression models with popular local descriptors. Our best model, trained on 100k samples, accurately predicts the shielding of 50k ‘hold-out’ atoms with a mean error of less than 1.9 ppm. For the rapid prediction of new query molecules, the models were trained on geometries from an inexpensive theory. Furthermore, by using a Δ-ML strategy, we quench the error below 1.4 ppm. Finally, we test the transferability on non-trivial benchmark sets that include molecules comprising 10–17 heavy atoms and drugs.
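As a rough illustration of the two ideas the abstract names, kernel-ridge regression and the Δ-ML strategy, the sketch below fits both a direct model and a Δ-model on synthetic data. Everything in it is an assumption made for illustration and none of it comes from the QM9-NMR paper: the two-component "descriptor", the toy `target` and `baseline` functions, the Laplacian kernel, and the hyperparameters `sigma` and `lam`. In Δ-ML the model learns only the difference between an expensive target and a cheap baseline, and the prediction is the baseline plus the learned correction.

```python
import math
import random

def laplacian_kernel(a, b, sigma):
    """Laplacian kernel exp(-|a - b|_1 / sigma) between two descriptor vectors."""
    return math.exp(-sum(abs(x - y) for x, y in zip(a, b)) / sigma)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def krr_fit(X, y, sigma=2.0, lam=1e-6):
    """Return regression weights alpha solving (K + lam * I) alpha = y."""
    K = [[laplacian_kernel(a, b, sigma) for b in X] for a in X]
    for i in range(len(X)):
        K[i][i] += lam
    return solve(K, y)

def krr_predict(X_train, alpha, X_query, sigma=2.0):
    """Predict each query as a kernel-weighted sum over training points."""
    return [sum(w * laplacian_kernel(q, x, sigma) for w, x in zip(alpha, X_train))
            for q in X_query]

def mae(pred, ref):
    """Mean absolute error."""
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(ref)

# Hypothetical stand-ins for an expensive "target" property and a cheap "baseline".
def target(x):
    return 100.0 + 40.0 * math.sin(3.0 * x[0]) + 20.0 * x[1] ** 2

def baseline(x):
    return target(x) - (2.0 + 0.5 * x[0])

random.seed(0)
X = [[random.random(), random.random()] for _ in range(120)]  # synthetic descriptors
X_train, X_test = X[:80], X[80:]
y_test = [target(x) for x in X_test]

# Direct learning: the model learns the target itself.
alpha_direct = krr_fit(X_train, [target(x) for x in X_train])
pred_direct = krr_predict(X_train, alpha_direct, X_test)

# Delta learning: the model learns target - baseline; prediction = baseline + correction.
alpha_delta = krr_fit(X_train, [target(x) - baseline(x) for x in X_train])
correction = krr_predict(X_train, alpha_delta, X_test)
pred_delta = [baseline(x) + c for x, c in zip(X_test, correction)]

mae_direct = mae(pred_direct, y_test)
mae_delta = mae(pred_delta, y_test)
print(f"direct MAE: {mae_direct:.4f}   delta MAE: {mae_delta:.4f}")
```

In this toy setting the correction is much smaller and smoother than the target, so the Δ-model should reach a lower hold-out error from the same training points. The size of any such gain on real shielding data depends on the chosen baseline and descriptors.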

Similar resources

Quantum chemistry structures and properties of 134 kilo molecules

Computational de novo design of new drugs and materials requires rigorous and unbiased exploration of chemical compound space. However, large uncharted territories persist due to its size scaling combinatorially with molecular size. We report computed geometric, energetic, electronic, and thermodynamic properties for 134k stable small organic molecules made up of CHONF. These molecules correspo...

A new approach to credibility premium for zero-inflated Poisson models for panel data

The main aim of this research is to obtain and compare credibility premiums in count models with unreported claims for longitudinal data. Predictive premiums are calculated under squared-error and exponential loss functions and compared with each other. The desire to earn a bonus is one of the main reasons for not reporting accidents, and policyholders often refrain from reporting low-cost accidents in order to benefit from discounts. In this research ...

Version Space Learning with DNA Molecules

Version space is used in inductive concept learning to represent the hypothesis space where the goal concept is expressed as a conjunction of attribute values. The size of the version space increases exponentially with the number of attributes. We present an efficient method for representing the version space with DNA molecules and demonstrate its effectiveness by experimental results. Primitiv...

Strategy for research of new pharmacologically active molecules from plants for the treatment of pathologies

Herbal medicine, botanical medicine, phytotherapy, alternative medicine, or complementary medicine are terms used to describe the science of using plant-based materials to treat specific symptoms or diseases. People have a strong belief that natural remedies are perfectly safe. Because we have strong ties to traditional culture, we use herbs and spices on a daily basis. Plants are an abundant natura...

Journal

Journal title: Machine Learning: Science and Technology

Year: 2021

ISSN: 2632-2153

DOI: https://doi.org/10.1088/2632-2153/abe347